In this assignment, you will practice using the Plotly Express library: https://plotly.com/python/plotly-express/
Your goal is to recreate the graphics below using Plotly Express. Match them as closely as possible.
You may work individually or with a group.
import plotly.express as px
df = px.data.tips()
df.head()
| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
A bar plot shows the relationship between a numeric variable and a categorical variable. Each level of the categorical variable is represented as a bar, and the size of the bar represents its numeric value.
fig = px.bar(df, x="sex", y="total_bill", color="smoker", barmode="group")
fig.show()
import plotly.express as px
df = px.data.tips()
fig = px.bar(df, x="sex", y="total_bill", color="day", barmode="group")
fig.show()
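To make the idea of "one numeric value per category" concrete, here is a minimal sketch that totals `total_bill` per `sex` for the five rows shown by `df.head()` above. It uses only the standard library. Using the sum as the aggregate is an assumption made for illustration, and the names `rows` and `totals` are hypothetical.

```python
from collections import defaultdict

# (sex, total_bill) pairs copied from the five rows of df.head() above
rows = [
    ("Female", 16.99),
    ("Male", 10.34),
    ("Male", 21.01),
    ("Male", 23.68),
    ("Female", 24.59),
]

# One numeric value per category: here, the sum of total_bill (an assumed aggregate)
totals = defaultdict(float)
for sex, total_bill in rows:
    totals[sex] += total_bill

# Print one line per category, which corresponds to one bar per category
for sex, total in sorted(totals.items()):
    print(f"{sex}: {total:.2f}")
```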
Scatter plots are used to check the relationship between variables and the distribution of the data. A scatter plot displays the relationship between two numeric variables: for each data point, the value of the first variable is shown on the x-axis and the value of the second on the y-axis.
import plotly.express as px
# using the dataset
df = px.data.tips()
# plotting the scatter chart
fig = px.scatter(df, x='total_bill', y="tip", color="sex", facet_col="smoker")
# showing the plot
fig.show()
# adding trend lines
import plotly.express as px
# using the dataset
df = px.data.tips()
# plotting the scatter chart
fig = px.scatter(df, x='total_bill', y="tip", color="sex", facet_col="smoker", trendline="ols")
# showing the plot
fig.show()
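The `trendline="ols"` option requests an ordinary least squares fit. As a rough sketch of what such a fit computes, the block below fits a straight line by hand to the five `(total_bill, tip)` pairs from `df.head()`. This is not how Plotly Express implements trend lines; the helper `fit_line` is hypothetical, and five rows are far too few for a meaningful fit.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel in the ratio)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Values copied from the five rows of df.head() above
total_bill = [16.99, 10.34, 21.01, 23.68, 24.59]
tip = [1.01, 1.66, 3.50, 3.31, 3.61]

slope, intercept = fit_line(total_bill, tip)
print(f"tip = {slope:.3f} * total_bill + {intercept:.3f}")
```

In the faceted figure above, a separate line is drawn for each color group in each panel, so the same calculation would be repeated on each subset of the data.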
# faceting by day and time, with an explicit category order
import plotly.express as px
# using the dataset
df = px.data.tips()
# plotting the scatter chart
fig = px.scatter(df, x="total_bill", y="tip", facet_col="day", facet_row="time",
                 category_orders={"day": ["Thur", "Fri", "Sat", "Sun"], "time": ["Dinner", "Lunch"]})
# showing the plot
fig.show()
A histogram takes a single numeric variable as input. The variable is cut into several bins, and the number of observations per bin is represented by the height of the bar. It is possible to represent the distribution of several variables on the same axis using this technique.
# Exploring the distribution of tip
fig = px.histogram(df, x="tip", marginal="rug")
fig.show()
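The binning step described above can be sketched without any plotting. The block below counts the five `tip` values from `df.head()` in bins of width 1. The bin width and the names `tips`, `bin_width`, and `counts` are assumptions for illustration; Plotly Express chooses its own bins unless told otherwise.

```python
import math
from collections import Counter

# tip values copied from the five rows of df.head() above
tips = [1.01, 1.66, 3.50, 3.31, 3.61]
bin_width = 1.0  # assumed bin width, chosen only for this illustration

# Assign each value to the bin [k * bin_width, (k + 1) * bin_width)
counts = Counter(math.floor(t / bin_width) for t in tips)

# Print a text histogram: one row per non-empty bin, one '#' per observation
for k in sorted(counts):
    low, high = k * bin_width, (k + 1) * bin_width
    print(f"[{low:.0f}, {high:.0f}): {'#' * counts[k]}")
```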
A box plot gives a compact summary of one or several numeric variables. The line that divides the box into two parts represents the median of the data.
fig = px.box(df, x='smoker', y="tip", color="smoker")
# showing the plot
fig.show()
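As a small sketch of the numbers behind a box, the block below computes the median and quartiles of the five `tip` values from `df.head()` using the standard library. The quartile convention used by `statistics.quantiles` may differ from the one a plotting library applies, so the box edges in a figure are not guaranteed to match these values exactly.

```python
import statistics

# tip values copied from the five rows of df.head() above
tips = [1.01, 1.66, 3.50, 3.31, 3.61]

# The median is the line that splits the box into two parts
median = statistics.median(tips)

# n=4 returns the three cut points Q1, Q2, Q3; Q1 and Q3 are the box edges.
# statistics.quantiles uses the "exclusive" method by default.
q1, q2, q3 = statistics.quantiles(tips, n=4)

print(f"median = {median}")
print(f"box spans Q1 = {q1:.3f} to Q3 = {q3:.3f}")
```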